The bucket box intersection (BBI) algorithm for fast approximative evaluation of diagonal mixture Gaussians
Abstract
Today, most state-of-the-art speech recognizers are based on hidden Markov modeling. With semi-continuous or continuous density hidden Markov models (HMMs), computing emission probabilities requires evaluating mixture Gaussian probability density functions. Since it is very expensive to evaluate all the Gaussians of the mixture density codebook, many recognizers compute only the M most significant Gaussians (M = 1, ..., 8). This paper presents an alternative approach to approximating mixture Gaussians with diagonal covariance matrices, based on a binary feature space partitioning tree. The proposed algorithm is experimentally evaluated in the context of large vocabulary, speaker-independent, spontaneous speech recognition using the JANUS-2 speech recognizer. In the case of mixtures with 50 Gaussians, we achieve a speedup of 2-5 in the computation of HMM emission probabilities, without affecting the accuracy of the system.
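The "M most significant Gaussians" approximation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the BBI algorithm and not code from the paper: the function names, the `top_m` parameter, and the toy codebook are assumptions chosen for illustration. The sketch still scores every Gaussian before picking the best M, so it shows only how little the mixture value changes, not the speedup. A real recognizer would obtain the shortlist from a cheaper selection step, such as the partitioning tree the paper proposes.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (per-dimension variances)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def mixture_loglik(x, weights, means, variances, top_m=None):
    """Log-likelihood of x under a diagonal-covariance Gaussian mixture.

    If top_m is given, only the top_m largest weighted component scores
    are summed (hypothetical parameter name, for illustration only).
    """
    scores = [
        math.log(w) + diag_gauss_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    if top_m is not None:
        scores = sorted(scores, reverse=True)[:top_m]
    # Log-sum-exp with the maximum factored out for numerical stability.
    peak = max(scores)
    return peak + math.log(sum(math.exp(s - peak) for s in scores))


# Toy codebook with three 2-dimensional Gaussians (made-up values).
weights = [0.5, 0.3, 0.2]
means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
x = [0.1, -0.2]

full = mixture_loglik(x, weights, means, variances)
best_one = mixture_loglik(x, weights, means, variances, top_m=1)
print(full, best_one)
```

For a feature vector close to one component and far from the others, the two values are nearly identical, which is why evaluating only a few well-chosen Gaussians can leave recognition accuracy unchanged.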
Similar resources
Improvements to bucket box intersection algorithm for fast GMM computation in embedded speech recognition systems
Real-time performance is a very important goal for embedded speech recognition systems, where the evaluation of likelihoods for Gaussian mixture models (GMM) usually dominates the computation of a continuous density hidden Markov model (CDHMM) based system. The Bucket Box Intersection (BBI) algorithm is an optimization technique that uses a K-Dimensional binary tree to speed up the score comput...
A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition
Gaussian mixture models are the most popular probability density used in automatic speech recognition. During decoding, often many Gaussians are evaluated. Only a small number of Gaussians contributes significantly to probability. Several promising methods to select relevant Gaussians are known. These methods have different properties in terms of required memory, overhead and quality of selecte...
Fast Gaussian Evaluations in Large Vocabulary Continuous Speech Recognition
Rapid advances in speech recognition theory, as well as computing hardware, have led to the development of machines that can take human speech as input, decode the information content of the speech, and respond accordingly. Real-time performance of such systems is often dominated by the evaluation of likelihoods in the statistical modeling component of the system. Statistical models are typical...
Speeding up the score computation of HMM speech recognizers with the bucket voronoi intersection algorithm
With increasing sizes of speech databases, speech recognizers with huge parameter spaces have become trainable. However, the time and memory requirements for high accuracy real-time speaker-independent continuous speech recognition will probably not be met by the available hardware for a reasonable price for the next few years. This paper describes the application of the Bucket Voronoi Intersec...